Dataset statistics
| Number of variables | 17 |
|---|---|
| Number of observations | 96643 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 11.9 MiB |
| Average record size in memory | 129.0 B |
Variable types
| NUM | 15 |
|---|---|
| CAT | 2 |
Warnings
| Alert | Type |
|---|---|
| dob_year is highly correlated with age | High correlation |
| age is highly correlated with dob_year | High correlation |
| mobile_likes_received is highly correlated with likes_received | High correlation |
| likes_received is highly correlated with mobile_likes_received and 1 other fields | High correlation |
| www_likes_received is highly correlated with likes_received | High correlation |
| likes_received is highly skewed (γ1 = 111.2322131) | Skewed |
| mobile_likes_received is highly skewed (γ1 = 107.0512479) | Skewed |
| www_likes_received is highly skewed (γ1 = 125.00983) | Skewed |
| df_index has unique values | Unique |
| userid has unique values | Unique |
| friend_count has 1894 (2.0%) zeros | Zeros |
| friendships_initiated has 2922 (3.0%) zeros | Zeros |
| likes has 22042 (22.8%) zeros | Zeros |
| likes_received has 24141 (25.0%) zeros | Zeros |
| mobile_likes has 34290 (35.5%) zeros | Zeros |
| mobile_likes_received has 29589 (30.6%) zeros | Zeros |
| www_likes has 59980 (62.1%) zeros | Zeros |
| www_likes_received has 36336 (37.6%) zeros | Zeros |
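The "Skewed" and "Zeros" alerts rest on two simple quantities: the skewness γ1 (the third standardized moment) and the share of entries equal to zero. The sketch below shows both in plain Python. It uses the population form of γ1, so the estimator behind the report's figures may differ by a small-sample correction, and the `sample` values are invented for illustration.

```python
def skewness(values):
    """Population skewness g1 = m3 / m2**1.5 (third standardized moment)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5


def zero_share(values):
    """Fraction of entries that are exactly zero."""
    return sum(1 for v in values if v == 0) / len(values)


# Hypothetical like counts: mostly small values plus one very large one
sample = [0, 0, 0, 1, 2, 3, 500]
g1 = skewness(sample)          # strongly positive: long right tail
zeros = zero_share(sample)     # 3 of 7 entries are zero
```

A single extreme value is enough to push γ1 far above zero, which is the pattern the three `*_likes_received` alerts point to.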
Reproduction
| Analysis started | 2021-01-24 14:06:47.663964 |
|---|---|
| Analysis finished | 2021-01-24 14:07:25.313559 |
| Duration | 37.65 seconds |
| Software version | pandas-profiling v2.9.0 |
| Download configuration | config.yaml |
df_index
Real number (ℝ≥0)
| Distinct | 96643 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 49117.06887 |
|---|---|
| Minimum | 0 |
| Maximum | 99002 |
| Zeros | 1 |
| Zeros (%) | < 0.1% |
| Memory size | 755.0 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 4913.1 |
| Q1 | 24420.5 |
| median | 48935 |
| Q3 | 73703.5 |
| 95-th percentile | 93922.9 |
| Maximum | 99002 |
| Range | 99002 |
| Interquartile range (IQR) | 49283 |
Descriptive statistics
| Standard deviation | 28512.14914 |
|---|---|
| Coefficient of variation (CV) | 0.580493702 |
| Kurtosis | -1.195116869 |
| Mean | 49117.06887 |
| Median Absolute Deviation (MAD) | 24639 |
| Skewness | 0.01663037795 |
| Sum | 4746820887 |
| Variance | 812942648.6 |
| Monotonicity | Strictly increasing |
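Several of the figures in the tables directly above are derived from others: the range from the extremes, the IQR from the quartiles, the coefficient of variation from the standard deviation and mean, and the variance from the standard deviation. The short check below recomputes them from the reported values; the variable names are illustrative only.

```python
# Values copied from the quantile and descriptive tables above
minimum, maximum = 0, 99002
q1, q3 = 24420.5, 73703.5
mean = 49117.06887
std = 28512.14914

value_range = maximum - minimum  # compare with the reported Range
iqr = q3 - q1                    # compare with the reported IQR
cv = std / mean                  # compare with the reported CV
variance = std ** 2              # compare with the reported Variance
```

The same relationships hold for every numeric variable in the report, so they are a quick sanity check when copying figures out of it.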
| Value | Count | Frequency (%) | |
| 2047 | 1 | < 0.1% | |
| 88766 | 1 | < 0.1% | |
| 68276 | 1 | < 0.1% | |
| 66229 | 1 | < 0.1% | |
| 72374 | 1 | < 0.1% | |
| 70327 | 1 | < 0.1% | |
| 92856 | 1 | < 0.1% | |
| 96954 | 1 | < 0.1% | |
| 94907 | 1 | < 0.1% | |
| 84668 | 1 | < 0.1% | |
| Other values (96633) | 96633 | > 99.9% |
| Value | Count | Frequency (%) | |
| 0 | 1 | < 0.1% | |
| 1 | 1 | < 0.1% | |
| 2 | 1 | < 0.1% | |
| 3 | 1 | < 0.1% | |
| 4 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 99002 | 1 | < 0.1% | |
| 99001 | 1 | < 0.1% | |
| 99000 | 1 | < 0.1% | |
| 98999 | 1 | < 0.1% | |
| 98998 | 1 | < 0.1% |
userid
Real number (ℝ≥0)
| Distinct | 96643 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1597170.923 |
|---|---|
| Minimum | 1000008 |
| Maximum | 2193542 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 755.0 KiB |
Quantile statistics
| Minimum | 1000008 |
|---|---|
| 5-th percentile | 1060688.2 |
| Q1 | 1299125 |
| median | 1596245 |
| Q3 | 1895876 |
| 95-th percentile | 2133390.8 |
| Maximum | 2193542 |
| Range | 1193534 |
| Interquartile range (IQR) | 596751 |
Descriptive statistics
| Standard deviation | 344021.4764 |
|---|---|
| Coefficient of variation (CV) | 0.215394277 |
| Kurtosis | -1.199343009 |
| Mean | 1597170.923 |
| Median Absolute Deviation (MAD) | 298352 |
| Skewness | -0.0002675454002 |
| Sum | 1.543553895e+11 |
| Variance | 1.183507762e+11 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 1443839 | 1 | < 0.1% | |
| 1907510 | 1 | < 0.1% | |
| 2083628 | 1 | < 0.1% | |
| 1139607 | 1 | < 0.1% | |
| 2024467 | 1 | < 0.1% | |
| 2085679 | 1 | < 0.1% | |
| 1179569 | 1 | < 0.1% | |
| 1778481 | 1 | < 0.1% | |
| 1651507 | 1 | < 0.1% | |
| 1382211 | 1 | < 0.1% | |
| Other values (96633) | 96633 | > 99.9% |
| Value | Count | Frequency (%) | |
| 1000008 | 1 | < 0.1% | |
| 1000013 | 1 | < 0.1% | |
| 1000038 | 1 | < 0.1% | |
| 1000059 | 1 | < 0.1% | |
| 1000061 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 2193542 | 1 | < 0.1% | |
| 2193538 | 1 | < 0.1% | |
| 2193522 | 1 | < 0.1% | |
| 2193499 | 1 | < 0.1% | |
| 2193485 | 1 | < 0.1% |
age
Real number (ℝ≥0)
| Distinct | 93 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 35.66490072 |
|---|---|
| Minimum | 13 |
| Maximum | 105 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 755.0 KiB |
Quantile statistics
| Minimum | 13 |
|---|---|
| 5-th percentile | 15 |
| Q1 | 20 |
| median | 28 |
| Q3 | 48 |
| 95-th percentile | 73 |
| Maximum | 105 |
| Range | 92 |
| Interquartile range (IQR) | 28 |
Descriptive statistics
| Standard deviation | 20.13183582 |
|---|---|
| Coefficient of variation (CV) | 0.5644719435 |
| Kurtosis | 1.275521029 |
| Mean | 35.66490072 |
| Median Absolute Deviation (MAD) | 10 |
| Skewness | 1.295347135 |
| Sum | 3446763 |
| Variance | 405.2908136 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 18 | 5188 | 5.4% | |
| 23 | 4393 | 4.5% | |
| 19 | 4385 | 4.5% | |
| 20 | 3767 | 3.9% | |
| 21 | 3667 | 3.8% | |
| 25 | 3629 | 3.8% | |
| 17 | 3279 | 3.4% | |
| 16 | 3081 | 3.2% | |
| 22 | 3030 | 3.1% | |
| 24 | 2827 | 2.9% | |
| Other values (83) | 59397 | 61.5% |
| Value | Count | Frequency (%) | |
| 13 | 478 | 0.5% | |
| 14 | 1920 | 2.0% | |
| 15 | 2616 | 2.7% | |
| 16 | 3081 | 3.2% | |
| 17 | 3279 | 3.4% |
| Value | Count | Frequency (%) | |
| 105 | 76 | 0.1% | |
| 104 | 73 | 0.1% | |
| 103 | 1036 | 1.1% | |
| 102 | 185 | 0.2% | |
| 101 | 155 | 0.2% |
dob_day
Real number (ℝ≥0)
| Distinct | 31 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 14.53257867 |
|---|---|
| Minimum | 1 |
| Maximum | 31 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 755.0 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 7 |
| median | 14 |
| Q3 | 22 |
| 95-th percentile | 29 |
| Maximum | 31 |
| Range | 30 |
| Interquartile range (IQR) | 15 |
Descriptive statistics
| Standard deviation | 9.000553099 |
|---|---|
| Coefficient of variation (CV) | 0.6193362724 |
| Kurtosis | -1.186798032 |
| Mean | 14.53257867 |
| Median Absolute Deviation (MAD) | 8 |
| Skewness | 0.1077744907 |
| Sum | 1404472 |
| Variance | 81.00995609 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 1 | 7606 | 7.9% | |
| 10 | 3946 | 4.1% | |
| 15 | 3498 | 3.6% | |
| 5 | 3445 | 3.6% | |
| 12 | 3333 | 3.4% | |
| 2 | 3329 | 3.4% | |
| 3 | 3237 | 3.3% | |
| 17 | 3204 | 3.3% | |
| 20 | 3194 | 3.3% | |
| 4 | 3154 | 3.3% | |
| Other values (21) | 58697 | 60.7% |
| Value | Count | Frequency (%) | |
| 1 | 7606 | 7.9% | |
| 2 | 3329 | 3.4% | |
| 3 | 3237 | 3.3% | |
| 4 | 3154 | 3.3% | |
| 5 | 3445 | 3.6% |
| Value | Count | Frequency (%) | |
| 31 | 1442 | 1.5% | |
| 30 | 2458 | 2.5% | |
| 29 | 2434 | 2.5% | |
| 28 | 2877 | 3.0% | |
| 27 | 2702 | 2.8% |
dob_year
Real number (ℝ≥0)
| Distinct | 93 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1977.335099 |
|---|---|
| Minimum | 1908 |
| Maximum | 2000 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 755.0 KiB |
Quantile statistics
| Minimum | 1908 |
|---|---|
| 5-th percentile | 1940 |
| Q1 | 1965 |
| median | 1985 |
| Q3 | 1993 |
| 95-th percentile | 1998 |
| Maximum | 2000 |
| Range | 92 |
| Interquartile range (IQR) | 28 |
Descriptive statistics
| Standard deviation | 20.13183582 |
|---|---|
| Coefficient of variation (CV) | 0.01018129695 |
| Kurtosis | 1.275521029 |
| Mean | 1977.335099 |
| Median Absolute Deviation (MAD) | 10 |
| Skewness | -1.295347135 |
| Sum | 191095596 |
| Variance | 405.2908136 |
| Monotonicity | Not monotonic |
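The block above (values 1908 to 2000, matching dob_year in the sample rows) mirrors the earlier block with values 13 to 105 (age): the standard deviation and kurtosis are identical and the skewness has the opposite sign. That is what one would expect if age were a constant minus dob_year, which would also explain the high-correlation alert for the pair. The check below uses only figures reported in the two blocks. The reference year it recovers is an inference from the means and is not stated anywhere in the report.

```python
# Summary figures copied from the two blocks (age, then dob_year)
age = {"mean": 35.66490072, "std": 20.13183582, "skew": 1.295347135}
dob_year = {"mean": 1977.335099, "std": 20.13183582, "skew": -1.295347135}

# If age = offset - dob_year, the two means sum to the offset
offset = age["mean"] + dob_year["mean"]

same_spread = age["std"] == dob_year["std"]          # shifting/negating keeps std
mirrored_skew = age["skew"] == -dob_year["skew"]     # negating flips skewness
```

Under that assumption one of the two columns carries no extra information, which is worth keeping in mind before using both as model inputs.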
| Value | Count | Frequency (%) | |
| 1995 | 5188 | 5.4% | |
| 1990 | 4393 | 4.5% | |
| 1994 | 4385 | 4.5% | |
| 1993 | 3767 | 3.9% | |
| 1992 | 3667 | 3.8% | |
| 1988 | 3629 | 3.8% | |
| 1996 | 3279 | 3.4% | |
| 1997 | 3081 | 3.2% | |
| 1991 | 3030 | 3.1% | |
| 1989 | 2827 | 2.9% | |
| Other values (83) | 59397 | 61.5% |
| Value | Count | Frequency (%) | |
| 1908 | 76 | 0.1% | |
| 1909 | 73 | 0.1% | |
| 1910 | 1036 | 1.1% | |
| 1911 | 185 | 0.2% | |
| 1912 | 155 | 0.2% |
| Value | Count | Frequency (%) | |
| 2000 | 478 | 0.5% | |
| 1999 | 1920 | 2.0% | |
| 1998 | 2616 | 2.7% | |
| 1997 | 3081 | 3.2% | |
| 1996 | 3279 | 3.4% |
dob_month
Real number (ℝ≥0)
| Distinct | 12 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 6.288701717 |
|---|---|
| Minimum | 1 |
| Maximum | 12 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 755.0 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 3 |
| median | 6 |
| Q3 | 9 |
| 95-th percentile | 12 |
| Maximum | 12 |
| Range | 11 |
| Interquartile range (IQR) | 6 |
Descriptive statistics
| Standard deviation | 3.525916743 |
|---|---|
| Coefficient of variation (CV) | 0.5606748264 |
| Kurtosis | -1.238341143 |
| Mean | 6.288701717 |
| Median Absolute Deviation (MAD) | 3 |
| Skewness | 0.0300618739 |
| Sum | 607759 |
| Variance | 12.43208888 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 1 | 11397 | 11.8% | |
| 10 | 8274 | 8.6% | |
| 5 | 8101 | 8.4% | |
| 8 | 8095 | 8.4% | |
| 3 | 7910 | 8.2% | |
| 7 | 7848 | 8.1% | |
| 9 | 7752 | 8.0% | |
| 12 | 7696 | 8.0% | |
| 4 | 7639 | 7.9% | |
| 2 | 7449 | 7.7% | |
| Other values (2) | 14482 | 15.0% |
| Value | Count | Frequency (%) | |
| 1 | 11397 | 11.8% | |
| 2 | 7449 | 7.7% | |
| 3 | 7910 | 8.2% | |
| 4 | 7639 | 7.9% | |
| 5 | 8101 | 8.4% |
| Value | Count | Frequency (%) | |
| 12 | 7696 | 8.0% | |
| 11 | 7045 | 7.3% | |
| 10 | 8274 | 8.6% | |
| 9 | 7752 | 8.0% | |
| 8 | 8095 | 8.4% |
gender
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 755.0 KiB |
| Value | Count | Frequency (%) | |
| male | 57239 | 59.2% | |
| female | 39404 | 40.8% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 6 |
|---|---|
| Median length | 4 |
| Mean length | 4.815454818 |
| Min length | 4 |
tenure
Real number (ℝ≥0)
| Distinct | 2380 |
|---|---|
| Distinct (%) | 2.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 523.0575106 |
|---|---|
| Minimum | 1 |
| Maximum | 3139 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 755.0 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 47 |
| Q1 | 224 |
| median | 407 |
| Q3 | 658 |
| 95-th percentile | 1535 |
| Maximum | 3139 |
| Range | 3138 |
| Interquartile range (IQR) | 434 |
Descriptive statistics
| Standard deviation | 440.4878077 |
|---|---|
| Coefficient of variation (CV) | 0.8421402977 |
| Kurtosis | 2.412641644 |
| Mean | 523.0575106 |
| Median Absolute Deviation (MAD) | 207 |
| Skewness | 1.570336249 |
| Sum | 50549847 |
| Variance | 194029.5087 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 300 | 173 | 0.2% | |
| 303 | 170 | 0.2% | |
| 272 | 162 | 0.2% | |
| 257 | 161 | 0.2% | |
| 242 | 161 | 0.2% | |
| 280 | 159 | 0.2% | |
| 297 | 159 | 0.2% | |
| 278 | 158 | 0.2% | |
| 285 | 158 | 0.2% | |
| 284 | 157 | 0.2% | |
| Other values (2370) | 95025 | 98.3% |
| Value | Count | Frequency (%) | |
| 1 | 60 | 0.1% | |
| 2 | 71 | 0.1% | |
| 3 | 78 | 0.1% | |
| 4 | 86 | 0.1% | |
| 5 | 90 | 0.1% |
| Value | Count | Frequency (%) | |
| 3139 | 1 | < 0.1% | |
| 3128 | 1 | < 0.1% | |
| 3019 | 1 | < 0.1% | |
| 2822 | 1 | < 0.1% | |
| 2716 | 1 | < 0.1% |
friend_count
Real number (ℝ≥0)
| Distinct | 2522 |
|---|---|
| Distinct (%) | 2.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 192.8686713 |
|---|---|
| Minimum | 0 |
| Maximum | 4923 |
| Zeros | 1894 |
| Zeros (%) | 2.0% |
| Memory size | 755.0 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 3 |
| Q1 | 30 |
| median | 80 |
| Q3 | 202 |
| 95-th percentile | 707 |
| Maximum | 4923 |
| Range | 4923 |
| Interquartile range (IQR) | 172 |
Descriptive statistics
| Standard deviation | 383.1613042 |
|---|---|
| Coefficient of variation (CV) | 1.986643562 |
| Kurtosis | 51.26205372 |
| Mean | 192.8686713 |
| Median Absolute Deviation (MAD) | 62 |
| Skewness | 6.128301581 |
| Sum | 18639407 |
| Variance | 146812.5851 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 0 | 1894 | 2.0% | |
| 1 | 1810 | 1.9% | |
| 2 | 1110 | 1.1% | |
| 3 | 856 | 0.9% | |
| 5 | 781 | 0.8% | |
| 4 | 742 | 0.8% | |
| 10 | 734 | 0.8% | |
| 24 | 727 | 0.8% | |
| 29 | 716 | 0.7% | |
| 6 | 710 | 0.7% | |
| Other values (2512) | 86563 | 89.6% |
| Value | Count | Frequency (%) | |
| 0 | 1894 | 2.0% | |
| 1 | 1810 | 1.9% | |
| 2 | 1110 | 1.1% | |
| 3 | 856 | 0.9% | |
| 4 | 742 | 0.8% |
| Value | Count | Frequency (%) | |
| 4923 | 1 | < 0.1% | |
| 4917 | 1 | < 0.1% | |
| 4863 | 1 | < 0.1% | |
| 4845 | 1 | < 0.1% | |
| 4844 | 1 | < 0.1% |
friendships_initiated
Real number (ℝ≥0)
| Distinct | 1507 |
|---|---|
| Distinct (%) | 1.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 106.0517368 |
|---|---|
| Minimum | 0 |
| Maximum | 4144 |
| Zeros | 2922 |
| Zeros (%) | 3.0% |
| Memory size | 755.0 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 16 |
| median | 45 |
| Q3 | 115 |
| 95-th percentile | 413 |
| Maximum | 4144 |
| Range | 4144 |
| Interquartile range (IQR) | 99 |
Descriptive statistics
| Standard deviation | 187.7078064 |
|---|---|
| Coefficient of variation (CV) | 1.769964473 |
| Kurtosis | 43.77488642 |
| Mean | 106.0517368 |
| Median Absolute Deviation (MAD) | 36 |
| Skewness | 5.222329218 |
| Sum | 10249158 |
| Variance | 35234.22058 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 0 | 2922 | 3.0% | |
| 1 | 2205 | 2.3% | |
| 2 | 1527 | 1.6% | |
| 3 | 1345 | 1.4% | |
| 4 | 1338 | 1.4% | |
| 6 | 1315 | 1.4% | |
| 5 | 1314 | 1.4% | |
| 11 | 1304 | 1.3% | |
| 8 | 1296 | 1.3% | |
| 13 | 1263 | 1.3% | |
| Other values (1497) | 80814 | 83.6% |
| Value | Count | Frequency (%) | |
| 0 | 2922 | 3.0% | |
| 1 | 2205 | 2.3% | |
| 2 | 1527 | 1.6% | |
| 3 | 1345 | 1.4% | |
| 4 | 1338 | 1.4% |
| Value | Count | Frequency (%) | |
| 4144 | 1 | < 0.1% | |
| 3654 | 1 | < 0.1% | |
| 3594 | 1 | < 0.1% | |
| 3538 | 1 | < 0.1% | |
| 3415 | 1 | < 0.1% |
likes
Real number (ℝ≥0)
| Distinct | 2908 |
|---|---|
| Distinct (%) | 3.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 156.4591641 |
|---|---|
| Minimum | 0 |
| Maximum | 25111 |
| Zeros | 22042 |
| Zeros (%) | 22.8% |
| Memory size | 755.0 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 1 |
| median | 11 |
| Q3 | 80 |
| 95-th percentile | 728 |
| Maximum | 25111 |
| Range | 25111 |
| Interquartile range (IQR) | 79 |
Descriptive statistics
| Standard deviation | 576.2652841 |
|---|---|
| Coefficient of variation (CV) | 3.683167344 |
| Kurtosis | 199.3027162 |
| Mean | 156.4591641 |
| Median Absolute Deviation (MAD) | 11 |
| Skewness | 11.00879469 |
| Sum | 15120683 |
| Variance | 332081.6776 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 0 | 22042 | 22.8% | |
| 1 | 6800 | 7.0% | |
| 2 | 4357 | 4.5% | |
| 3 | 3172 | 3.3% | |
| 4 | 2432 | 2.5% | |
| 5 | 1979 | 2.0% | |
| 6 | 1753 | 1.8% | |
| 7 | 1573 | 1.6% | |
| 8 | 1397 | 1.4% | |
| 9 | 1351 | 1.4% | |
| Other values (2898) | 49787 | 51.5% |
| Value | Count | Frequency (%) | |
| 0 | 22042 | 22.8% | |
| 1 | 6800 | 7.0% | |
| 2 | 4357 | 4.5% | |
| 3 | 3172 | 3.3% | |
| 4 | 2432 | 2.5% |
| Value | Count | Frequency (%) | |
| 25111 | 1 | < 0.1% | |
| 21652 | 1 | < 0.1% | |
| 16732 | 1 | < 0.1% | |
| 16583 | 1 | < 0.1% | |
| 14799 | 1 | < 0.1% |
likes_received
Real number (ℝ≥0)
| Distinct | 2661 |
|---|---|
| Distinct (%) | 2.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 143.331219 |
|---|---|
| Minimum | 0 |
| Maximum | 261197 |
| Zeros | 24141 |
| Zeros (%) | 25.0% |
| Memory size | 755.0 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 1 |
| median | 8 |
| Q3 | 58 |
| 95-th percentile | 563.9 |
| Maximum | 261197 |
| Range | 261197 |
| Interquartile range (IQR) | 57 |
Descriptive statistics
| Standard deviation | 1402.517934 |
|---|---|
| Coefficient of variation (CV) | 9.785153184 |
| Kurtosis | 17078.86114 |
| Mean | 143.331219 |
| Median Absolute Deviation (MAD) | 8 |
| Skewness | 111.2322131 |
| Sum | 13851959 |
| Variance | 1967056.556 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 0 | 24141 | 25.0% | |
| 1 | 7166 | 7.4% | |
| 2 | 4460 | 4.6% | |
| 3 | 3280 | 3.4% | |
| 4 | 2600 | 2.7% | |
| 5 | 2310 | 2.4% | |
| 6 | 1820 | 1.9% | |
| 7 | 1645 | 1.7% | |
| 8 | 1488 | 1.5% | |
| 9 | 1321 | 1.4% | |
| Other values (2651) | 46412 | 48.0% |
| Value | Count | Frequency (%) | |
| 0 | 24141 | 25.0% | |
| 1 | 7166 | 7.4% | |
| 2 | 4460 | 4.6% | |
| 3 | 3280 | 3.4% | |
| 4 | 2600 | 2.7% |
| Value | Count | Frequency (%) | |
| 261197 | 1 | < 0.1% | |
| 178166 | 1 | < 0.1% | |
| 152014 | 1 | < 0.1% | |
| 106025 | 1 | < 0.1% | |
| 82623 | 1 | < 0.1% |
mobile_likes
Real number (ℝ≥0)
| Distinct | 2385 |
|---|---|
| Distinct (%) | 2.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 106.7407572 |
|---|---|
| Minimum | 0 |
| Maximum | 25111 |
| Zeros | 34290 |
| Zeros (%) | 35.5% |
| Memory size | 755.0 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 4 |
| Q3 | 46 |
| 95-th percentile | 484 |
| Maximum | 25111 |
| Range | 25111 |
| Interquartile range (IQR) | 46 |
Descriptive statistics
| Standard deviation | 448.7178652 |
|---|---|
| Coefficient of variation (CV) | 4.203810025 |
| Kurtosis | 358.1396129 |
| Mean | 106.7407572 |
| Median Absolute Deviation (MAD) | 4 |
| Skewness | 14.13021401 |
| Sum | 10315747 |
| Variance | 201347.7226 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 0 | 34290 | 35.5% | |
| 1 | 6163 | 6.4% | |
| 2 | 3851 | 4.0% | |
| 3 | 2844 | 2.9% | |
| 4 | 2195 | 2.3% | |
| 5 | 1744 | 1.8% | |
| 6 | 1555 | 1.6% | |
| 7 | 1359 | 1.4% | |
| 8 | 1186 | 1.2% | |
| 9 | 1114 | 1.2% | |
| Other values (2375) | 40342 | 41.7% |
| Value | Count | Frequency (%) | |
| 0 | 34290 | 35.5% | |
| 1 | 6163 | 6.4% | |
| 2 | 3851 | 4.0% | |
| 3 | 2844 | 2.9% | |
| 4 | 2195 | 2.3% |
| Value | Count | Frequency (%) | |
| 25111 | 1 | < 0.1% | |
| 21652 | 1 | < 0.1% | |
| 16732 | 1 | < 0.1% | |
| 14039 | 1 | < 0.1% | |
| 13529 | 1 | < 0.1% |
mobile_likes_received
Real number (ℝ≥0)
| Distinct | 1991 |
|---|---|
| Distinct (%) | 2.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 84.47543019 |
|---|---|
| Minimum | 0 |
| Maximum | 138561 |
| Zeros | 29589 |
| Zeros (%) | 30.6% |
| Memory size | 755.0 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 4 |
| Q3 | 33 |
| 95-th percentile | 319 |
| Maximum | 138561 |
| Range | 138561 |
| Interquartile range (IQR) | 33 |
Descriptive statistics
| Standard deviation | 847.6838415 |
|---|---|
| Coefficient of variation (CV) | 10.03467919 |
| Kurtosis | 15322.77126 |
| Mean | 84.47543019 |
| Median Absolute Deviation (MAD) | 4 |
| Skewness | 107.0512479 |
| Sum | 8163959 |
| Variance | 718567.8951 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 0 | 29589 | 30.6% | |
| 1 | 8079 | 8.4% | |
| 2 | 4842 | 5.0% | |
| 3 | 3516 | 3.6% | |
| 4 | 2861 | 3.0% | |
| 5 | 2322 | 2.4% | |
| 6 | 1958 | 2.0% | |
| 7 | 1694 | 1.8% | |
| 8 | 1465 | 1.5% | |
| 9 | 1396 | 1.4% | |
| Other values (1981) | 38921 | 40.3% |
| Value | Count | Frequency (%) | |
| 0 | 29589 | 30.6% | |
| 1 | 8079 | 8.4% | |
| 2 | 4842 | 5.0% | |
| 3 | 3516 | 3.6% | |
| 4 | 2861 | 3.0% |
| Value | Count | Frequency (%) | |
| 138561 | 1 | < 0.1% | |
| 131244 | 1 | < 0.1% | |
| 89911 | 1 | < 0.1% | |
| 73333 | 1 | < 0.1% | |
| 43410 | 1 | < 0.1% |
www_likes
Real number (ℝ≥0)
| Distinct | 1712 |
|---|---|
| Distinct (%) | 1.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 49.71835518 |
|---|---|
| Minimum | 0 |
| Maximum | 14865 |
| Zeros | 59980 |
| Zeros (%) | 62.1% |
| Memory size | 755.0 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 6 |
| 95-th percentile | 205 |
| Maximum | 14865 |
| Range | 14865 |
| Interquartile range (IQR) | 6 |
Descriptive statistics
| Standard deviation | 287.2087478 |
|---|---|
| Coefficient of variation (CV) | 5.77671459 |
| Kurtosis | 448.6697188 |
| Mean | 49.71835518 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 16.94097656 |
| Sum | 4804931 |
| Variance | 82488.86481 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 0 | 59980 | 62.1% | |
| 1 | 4572 | 4.7% | |
| 2 | 2691 | 2.8% | |
| 3 | 1891 | 2.0% | |
| 4 | 1371 | 1.4% | |
| 5 | 1168 | 1.2% | |
| 6 | 1050 | 1.1% | |
| 7 | 861 | 0.9% | |
| 8 | 772 | 0.8% | |
| 9 | 731 | 0.8% | |
| Other values (1702) | 21556 | 22.3% |
| Value | Count | Frequency (%) | |
| 0 | 59980 | 62.1% | |
| 1 | 4572 | 4.7% | |
| 2 | 2691 | 2.8% | |
| 3 | 1891 | 2.0% | |
| 4 | 1371 | 1.4% |
| Value | Count | Frequency (%) | |
| 14865 | 1 | < 0.1% | |
| 12903 | 1 | < 0.1% | |
| 11077 | 1 | < 0.1% | |
| 10763 | 1 | < 0.1% | |
| 10627 | 1 | < 0.1% |
www_likes_received
Real number (ℝ≥0)
| Distinct | 1627 |
|---|---|
| Distinct (%) | 1.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 58.85574744 |
|---|---|
| Minimum | 0 |
| Maximum | 129953 |
| Zeros | 36336 |
| Zeros (%) | 37.6% |
| Memory size | 755.0 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 2 |
| Q3 | 20 |
| 95-th percentile | 228 |
| Maximum | 129953 |
| Range | 129953 |
| Interquartile range (IQR) | 20 |
Descriptive statistics
| Standard deviation | 608.2756902 |
|---|---|
| Coefficient of variation (CV) | 10.33502617 |
| Kurtosis | 23311.62097 |
| Mean | 58.85574744 |
| Median Absolute Deviation (MAD) | 2 |
| Skewness | 125.00983 |
| Sum | 5687996 |
| Variance | 369999.3152 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 0 | 36336 | 37.6% | |
| 1 | 8324 | 8.6% | |
| 2 | 4984 | 5.2% | |
| 3 | 3482 | 3.6% | |
| 4 | 2749 | 2.8% | |
| 5 | 2250 | 2.3% | |
| 6 | 1854 | 1.9% | |
| 7 | 1560 | 1.6% | |
| 8 | 1404 | 1.5% | |
| 9 | 1335 | 1.4% | |
| Other values (1617) | 32365 | 33.5% |
| Value | Count | Frequency (%) | |
| 0 | 36336 | 37.6% | |
| 1 | 8324 | 8.6% | |
| 2 | 4984 | 5.2% | |
| 3 | 3482 | 3.6% | |
| 4 | 2749 | 2.8% |
| Value | Count | Frequency (%) | |
| 129953 | 1 | < 0.1% | |
| 62103 | 1 | < 0.1% | |
| 39605 | 1 | < 0.1% | |
| 39213 | 1 | < 0.1% | |
| 34039 | 1 | < 0.1% |
age_group
Categorical
| Distinct | 10 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 94.8 KiB |
| Value | Count | Frequency (%) | |
| 21-30 | 28610 | 29.6% | |
| 10-20 | 24714 | 25.6% | |
| 31-40 | 12481 | 12.9% | |
| 51-60 | 9287 | 9.6% | |
| 41-50 | 8960 | 9.3% | |
| 61-70 | 6828 | 7.1% | |
| 71-80 | 2234 | 2.3% | |
| >100 | 1525 | 1.6% | |
| 91-100 | 1201 | 1.2% | |
| 81-90 | 803 | 0.8% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 6 |
|---|---|
| Median length | 5 |
| Mean length | 4.996647455 |
| Min length | 4 |
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
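The definition above (covariance divided by the product of the standard deviations) can be sketched in a few lines of plain Python. The function name and the sample data are illustrative and are not taken from the report.

```python
import math


def pearson_r(x, y):
    """Pearson's r: covariance of x and y over the product of their std devs."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)


x = [1, 2, 3, 4, 5]
y_linear = [2 * v + 1 for v in x]   # exact increasing linear relation
y_square = [v ** 2 for v in x]      # monotonic but not linear
```

Rescaling or shifting either input leaves the result unchanged, which is the invariance property mentioned in the text; `y_square` gives a value below 1 because the relation is not linear.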
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
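In other words, ρ is Pearson's r applied to ranks. The sketch below makes that explicit; it assigns tied values the average of their positions, which is a common convention assumed here rather than something the report states. All names are illustrative.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Covariance over the product of standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    sx = math.sqrt(sum((a - mx) ** 2 for a in x) / n)
    sy = math.sqrt(sum((b - my) ** 2 for b in y) / n)
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Spearman's rho: Pearson's r computed on the rank variables."""
    return pearson_r(average_ranks(x), average_ranks(y))


x = [1, 2, 3, 4, 5]
y_cubed = [v ** 3 for v in x]   # nonlinear but strictly increasing
```

For `x` and `y_cubed` the ranks coincide, so ρ reaches its maximum even though the relation is far from linear.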
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
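The pair-counting rule described above corresponds to the variant usually called τ-a. A direct, quadratic-time sketch follows; library implementations often use a tie-corrected variant instead, so results can differ when the data contain ties. The function name and inputs are illustrative.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs."""
    concordant = 0
    discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        # Positive product: both variables order the pair the same way
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
        # sign == 0 is a tie and counts as neither
    return (concordant - discordant) / len(pairs)


x = [1, 2, 3]
y = [1, 3, 2]   # one of the three pairs is out of order
tau = kendall_tau_a(x, y)
```

With three observations there are three pairs; two are concordant and one is discordant, so `tau` is one third.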
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available for the coefficient.
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proved to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013.
First rows
| | df_index | userid | age | dob_day | dob_year | dob_month | gender | tenure | friend_count | friendships_initiated | likes | likes_received | mobile_likes | mobile_likes_received | www_likes | www_likes_received | age_group |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 2094382 | 14 | 19 | 1999 | 11 | male | 266 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 10-20 |
| 1 | 1 | 1192601 | 14 | 2 | 1999 | 11 | female | 6 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 10-20 |
| 2 | 2 | 2083884 | 14 | 16 | 1999 | 11 | male | 13 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 10-20 |
| 3 | 3 | 1203168 | 14 | 25 | 1999 | 12 | female | 93 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 10-20 |
| 4 | 4 | 1733186 | 14 | 4 | 1999 | 12 | male | 82 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 10-20 |
| 5 | 5 | 1524765 | 14 | 1 | 1999 | 12 | male | 15 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 10-20 |
| 6 | 6 | 1136133 | 13 | 14 | 2000 | 1 | male | 12 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 10-20 |
| 7 | 8 | 1365174 | 13 | 1 | 2000 | 1 | male | 81 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 10-20 |
| 8 | 9 | 1712567 | 13 | 2 | 2000 | 2 | male | 171 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 10-20 |
| 9 | 10 | 1612453 | 13 | 22 | 2000 | 2 | male | 98 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 10-20 |
Last rows
| | df_index | userid | age | dob_day | dob_year | dob_month | gender | tenure | friend_count | friendships_initiated | likes | likes_received | mobile_likes | mobile_likes_received | www_likes | www_likes_received | age_group |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 96633 | 98993 | 1654565 | 19 | 15 | 1994 | 8 | male | 394 | 4538 | 4144 | 4501 | 15088 | 4435 | 5961 | 66 | 9127 | 10-20 |
| 96634 | 98994 | 2063006 | 20 | 4 | 1993 | 1 | female | 402 | 1988 | 332 | 7351 | 106025 | 7248 | 73333 | 103 | 32692 | 10-20 |
| 96635 | 98995 | 1132164 | 20 | 9 | 1993 | 10 | female | 699 | 3611 | 973 | 4507 | 7768 | 4414 | 6909 | 93 | 859 | 10-20 |
| 96636 | 98996 | 1668695 | 24 | 25 | 1989 | 4 | female | 182 | 2938 | 1272 | 6018 | 17765 | 5843 | 11708 | 175 | 6057 | 21-30 |
| 96637 | 98997 | 1458985 | 28 | 14 | 1985 | 12 | female | 290 | 2218 | 1618 | 4626 | 10268 | 4290 | 4250 | 336 | 6018 | 21-30 |
| 96638 | 98998 | 1268299 | 68 | 4 | 1945 | 4 | female | 541 | 2118 | 341 | 3996 | 18089 | 3505 | 11887 | 491 | 6202 | 61-70 |
| 96639 | 98999 | 1256153 | 18 | 12 | 1995 | 3 | female | 21 | 1968 | 1720 | 4401 | 13412 | 4399 | 10592 | 2 | 2820 | 10-20 |
| 96640 | 99000 | 1195943 | 15 | 10 | 1998 | 5 | female | 111 | 2002 | 1524 | 11959 | 12554 | 11959 | 11462 | 0 | 1092 | 10-20 |
| 96641 | 99001 | 1468023 | 23 | 11 | 1990 | 4 | female | 416 | 2560 | 185 | 4506 | 6516 | 4506 | 5760 | 0 | 756 | 21-30 |
| 96642 | 99002 | 1397896 | 39 | 15 | 1974 | 5 | female | 397 | 2049 | 768 | 9410 | 12443 | 9410 | 9530 | 0 | 2913 | 31-40 |